CS 260: Machine Learning Theory, Lecture 14: Generalization Error of AdaBoost

Author

  • Jennifer Wortman Vaughan
Abstract

We saw last time that the training error of AdaBoost decreases exponentially as the number of rounds T grows. However, this says nothing about how well the function output by AdaBoost performs on new examples. Today we will discuss the generalization error of AdaBoost.

We know that AdaBoost gives us a consistent function quickly: the bound we derived on training error decreases exponentially, and once this bound drops below 1/m, where m is the number of training examples, the training error (always a multiple of 1/m) must be zero, so we have a consistent function. Because of this, if we could find the VC dimension of the class of functions from which AdaBoost chooses, we could apply our results from the first few weeks of class to get a bound on generalization error. Let H be the class of functions from which the weak learning algorithm A chooses, and let d be the VC dimension of this class. The class of functions H′ from which AdaBoost chooses is the class of all functions h that can be written as ...
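
To make the 1/m threshold concrete, here is a minimal Python sketch that computes how many rounds are enough for the training-error bound to fall below 1/m. It assumes the standard form of the bound, exp(-2γ²T), where γ is a lower bound on each weak hypothesis's edge over random guessing. That form, and the names `rounds_until_consistent`, `m`, and `gamma`, are assumptions for illustration and are not taken from this abstract.

```python
import math


def rounds_until_consistent(m: int, gamma: float) -> int:
    """Smallest T with exp(-2 * gamma**2 * T) < 1/m.

    Assumes (hypothetically) that every weak hypothesis has weighted
    error at most 1/2 - gamma, so the training error after T rounds
    is bounded by exp(-2 * gamma**2 * T).
    """
    if m < 1:
        raise ValueError("m must be a positive number of training examples")
    if not (0.0 < gamma <= 0.5):
        raise ValueError("gamma must lie in (0, 0.5]")
    # exp(-2 * gamma^2 * T) < 1/m  <=>  T > ln(m) / (2 * gamma^2).
    # Training error is a multiple of 1/m, so a bound below 1/m forces it to 0.
    return math.floor(math.log(m) / (2.0 * gamma ** 2)) + 1


if __name__ == "__main__":
    for m, gamma in [(100, 0.1), (1000, 0.1), (1000, 0.25)]:
        T = rounds_until_consistent(m, gamma)
        print(f"m={m}, gamma={gamma}: bound < 1/m after T={T} rounds")
```

Under this assumed bound, the required number of rounds grows only logarithmically in m, which is the sense in which a consistent function is reached quickly.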

Similar resources

CS 269: Machine Learning Theory, Lecture 14: Generalization Error of Adaboost

In this lecture we will continue our discussion of the Adaboost algorithm and derive a bound on the generalization error. We saw last time that the training error decreases exponentially with respect to the number of rounds T. However, we also want to see the performance of this algorithm on new test data. Today we will show why the Adaboost algorithm generalizes so well and why it avoids over...

The Boosting Approach to Machine Learning An Overview

Boosting is a general method for improving the accuracy of any given learning algorithm. Focusing primarily on the AdaBoost algorithm, this chapter overviews some of the recent work on boosting including analyses of AdaBoost’s training error and generalization error; boosting’s connection to game theory and linear programming; the relationship between boosting and logistic regression; extension...

Consistency of Nearest Neighbor Methods

In this lecture we return to the study of consistency properties of learning algorithms, where we will be interested in the question of whether the generalization error of the function learned by an algorithm approaches the Bayes error in the limit of infinite data. In particular, we will consider consistency properties of the simple k-nearest neighbor (k-NN) classification algorithm (in the ne...

Adaboost and Learning Algorithms: an Introduction

This article will give a general overview of boosting and in particular AdaBoost. AdaBoost is the most popular boosting algorithm. It has been shown to have very interesting properties such as low generalization error as well as an exponentially decreasing bound on the training error. The article will also give a short introduction to learning algorithms.

A Refined Margin Analysis for Boosting Algorithms via Equilibrium Margin

Much attention has been paid to the theoretical explanation of the empirical success of AdaBoost. The most influential work is the margin theory, which is essentially an upper bound for the generalization error of any voting classifier in terms of the margin distribution over the training data. However, important questions were raised about the margin explanation. Breiman (1999) proved a bound ...

Publication date: 2011